Global nitrogen deposition inputs to cropland at national scale from 1961 to 2020

Nitrogen (N) deposition is a significant nutrient input to cropland and consequently important for the evaluation of N budgets and N use efficiency (NUE) at different scales and over time. However, the spatiotemporal coverage of N deposition measurements is limited globally, whereas modeled N deposition values carry uncertainties. Here, we reviewed existing methods and related data sources for quantifying N deposition inputs to crop production on a national scale. We utilized different data sources to estimate N deposition input to crop production at national scale and compared our estimates with 14 N budget datasets, as well as measured N deposition data from observation networks in 9 countries. We created four datasets of N deposition inputs on cropland during 1961–2020 for 236 countries. These products showed good agreement for the majority of countries and can be used in the modeling and assessment of NUE at national and global scales. One of the datasets is recommended for general use in regional to global N budget and NUE estimates.

calculated by overlaying the modelled N deposition maps with a cropland distribution map, as well as country boundaries (Table 1). However, the spatial and temporal coverage and the precision of these maps vary among studies. Based on a summary of the methods, data availability, and a survey of expert opinions, we chose to use two different global N deposition maps (i.e., Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP) 10 and Wang et al. [18][19][20]) and two cropland maps (i.e., Land-Use Harmonization 2 (LUH2) 21 and History of the Global Environment database (HYDE) 22 ) in this study (Tables 1 and 2). While some region-specific models that emphasize cropland [23][24][25] might yield a more precise evaluation of N deposition, their restricted spatial extent does not align with the objectives of this study, which seeks to quantify N deposition for the majority of countries worldwide. In addition, few regional models cover the full period from 1961 to 2020. Combining these different map sources resulted in four products of N deposition inputs on cropland at annual scale for 236 countries for the period 1961-2020. We compare our N deposition estimates with existing literature and ground observations, and discuss the four products and their usability. The four products are the combinations of the two N deposition maps with the two cropland maps: AH: ACCMIP + HYDE, AL: ACCMIP + LUH2, WH: Wang et al. + HYDE, and WL: Wang et al. + LUH2. On balance, we recommend using the WL data product for global estimates of N deposition on cropland. This data product was included in the Cropland Nutrient Budget database, a joint release by the Food and Agriculture Organization of the United Nations (FAO), the International Fertilizer Association (IFA), and various research groups 26 , available at https://www.fao.org/faostat/en/#data/ESB.

Methods
Data sources. The datasets used for developing N deposition at a national scale include N deposition maps and cropland maps. The two types of maps have different spatiotemporal resolutions (Table 2). The N deposition maps from ACCMIP include dry and wet deposition of NHy and NOx, while those of Wang et al. comprise bulk N deposition without discriminating between the types of N deposition. Cropland maps (i.e., LUH2 and HYDE) with finer resolution were adjusted by summing their values to a coarser resolution to match the grid resolution of the N deposition map. Since the N deposition data from Wang et al. are available for a shorter time period than the period targeted in this study, we used N emission data from the Emissions Database for Global Atmospheric Research (EDGAR) 27 to extrapolate and interpolate the values of N deposition to complete the time series from 1961 to 2020. We compared our national scale estimates with observed N deposition data from measurement station networks in the UK, China, the USA, and East Asia, and with the N deposition estimates used in 14 global N budget datasets from Zhang et al. 14 .
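The resolution matching described above (summing finer cropland cells into the coarser deposition grid) can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name `coarsen_by_sum` is hypothetical, the values are made up, and the sketch assumes the coarse cell size is an exact integer multiple of the fine cell size, which need not hold for the actual grids.

```python
def coarsen_by_sum(grid, factor_rows, factor_cols):
    """Sum cropland areas of fine cells into coarser cells.

    grid: list of rows, each a list of cropland areas (e.g. ha) per fine cell.
    factor_rows, factor_cols: number of fine cells per coarse cell
    along each axis (assumed to divide the grid shape exactly).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor_rows or n_cols % factor_cols:
        raise ValueError("grid shape must be divisible by the factors")
    coarse = []
    for r in range(0, n_rows, factor_rows):
        row = []
        for c in range(0, n_cols, factor_cols):
            # Areas are additive, so a coarse cell holds the sum of its fine cells.
            total = sum(
                grid[i][j]
                for i in range(r, r + factor_rows)
                for j in range(c, c + factor_cols)
            )
            row.append(total)
        coarse.append(row)
    return coarse


# Illustrative 2 x 4 fine grid collapsed into a 1 x 2 coarse grid.
fine = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 5.0],
]
coarse = coarsen_by_sum(fine, 2, 2)
print(coarse)
```

Summing is appropriate here because the quantity is an area; a per-hectare rate would be averaged rather than summed.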
The ACCMIP dataset is a multi-model ensemble dataset providing the mean of N deposition across 11 models 10 . Out of 11 models, 10 models include NOx chemistry, while only 5 models include NHy chemistry. All the models varied in their spatial resolution to model deposition (see supplementary information in Lamarque et al. 10 ). Both dry and wet depositions of oxidized and reduced N are estimated and simulated for five time slices: 1850, 1980, 2000, 2030, and 2100. The deposition estimates, originally at monthly time scale, are averaged across models. The emissions used for modeling deposition include anthropogenic (including shipping and aircraft), biomass burning, and natural emissions (such as soil NOx). Natural emissions were not harmonized across models. The models are calibrated for the 1980-2000 period and represent climate change in increments of 10 years rather than a specific meteorological year. Only the model estimates of wet deposition are evaluated against measurement stations in North America, Europe, and Asia, with no information on the number of sites. This comparison showed ACCMIP results lower in North America and Europe, and worse in Asia compared with previous studies 10 .
In contrast to ACCMIP, Wang et al. data, which also have dry and wet N depositions for the period of 1850-2100, were quantified by LMDZ-OR-INCA (Laboratoire de Météorologie Dynamique -OR-INteraction with […] and NHy). To capture N in aerosols and gases and simulate global dry and wet deposition, the LMDZ-INCA was run at a spatial resolution of 1.27° latitude by 2.5° longitude for the years 1850, 1960, 1970, 1980, 1990, and 1997-2013. The detailed methodology to model transport and removal processes is in Wang et al. 20 . Model evaluation was performed using recent global modeled datasets on wet N deposition rates 9 , while the evaluation of dry deposition was not performed due to a lack of data 30,31 . The evaluation of wet deposition against station measurements includes North America, Europe, Africa, Asia, South America, and forest areas. The major difference between ACCMIP and Wang et al. lies in the emission data used to model the deposition, which leads to differences in the N deposition estimates.
Of the two cropland maps, HYDE is created using simple time-varying land allocation algorithms. It develops land use maps (e.g., cropland) from 10,000 BCE to present. The HYDE dataset is developed at 100-year intervals from 800 to 1700, at a decadal scale between 1700 and 2000, and at an annual scale from 2000 to 2015.
The key input data include historical national level "arable land and permanent crops" and "permanent meadows and pastures" from the Food and Agriculture Organization of the United Nations (FAO) 32 . Additionally, subnational datasets are utilized. These coarser statistics on land use are then converted to finer resolution croplands using European Space Agency (ESA) Land Cover Consortium maps 33,34 and following a sequence of allocation steps. The ESA land cover provides a land mask for HYDE and is available for three years, but only the most recent epoch is used to develop the dataset. For more details on the steps, see Goldewijk et al. 22 . At a national scale, the HYDE dataset has shown estimates consistent with FAO's land use data but differs more from other satellite-based products 35 .
The input dataset in LUH2 land-use states includes the HYDE database to develop a historic dataset. The data from HYDE are on a decadal scale from 1700 to 2000 and annually from 2000 to the present. These data were linearly interpolated to establish an annual time series of gridded cell area fractions of different land use types (e.g., cropland). The cropland area fraction includes five different functional types: C3 annuals, C4 annuals, C3 perennials, C4 perennials, and C3 nitrogen fixers. This dataset also takes crop rotation and agricultural management practices into account. Although LUH2 is an improvement over HYDE, it also underestimates cropland area in some regions compared to other studies 36 .
The goal of the comparison against site-based observations is to see whether our national N deposition estimates fall in the range of the site-based observations in cropland. Although we do not anticipate our national estimates to exactly match the values and trends of the measurements from these cropland sites, we do expect them to align within the observed range (i.e., have a similar magnitude and/or exhibit comparable overall trends). Since our study focuses on cropland, we limited our comparison to the sites that are located either in croplands or agricultural lands. Such selection criteria reduce the number of available measurement stations. Additionally, the majority of sites provide wet deposition estimates only, and dry deposition measurements are not available in the station network sites due to a lack of data 20,30,31 . Despite this limitation, we included wet N deposition from NADP, ECN, and China for comparison. Except for a few sites in EANET that are located in rural areas and have both dry and wet deposition 37 , the sites have data for wet precipitation chemistry only. These limitations indicate that the measurement of N deposition on agricultural land requires both better spatial coverage worldwide and greater standardization of protocols and data. Hence, for comparison on agricultural/crop land, we have collected and compared our estimates with long-term data on N deposition from the USA, China, the UK, and East Asian sites.
Methods for estimating N deposition. Following previous studies, the N deposition is estimated by overlaying the N deposition maps, cropland distribution maps, and country boundaries (Fig. 2). Then, using cropland area as weights within a country's boundary, we aggregate N deposition to a national scale. The N deposition data from Wang et al. are available as bulk estimates, whereas the ACCMIP dataset provides deposition in separate forms (i.e., NOx and NHy). In the latter case, the N deposition is calculated for the different forms of reactive N, and then aggregated together to represent bulk N deposition.
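The cropland-area-weighted aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the study's implementation: the function name `national_deposition`, the flat-list representation of grid cells, the country labels, and all numbers are hypothetical.

```python
def national_deposition(dep_rate, cropland_area, country_id):
    """Cropland-area-weighted mean N deposition per country.

    dep_rate:      deposition rate per grid cell (kg N/ha/yr)
    cropland_area: cropland area per grid cell (ha), used as the weight
    country_id:    country label per grid cell (None = outside any country)
    All three are flat lists over the same grid cells.
    """
    weighted_sum = {}
    area_sum = {}
    for rate, area, cid in zip(dep_rate, cropland_area, country_id):
        if cid is None or area <= 0:
            continue  # cells without cropland carry no weight
        weighted_sum[cid] = weighted_sum.get(cid, 0.0) + rate * area
        area_sum[cid] = area_sum.get(cid, 0.0) + area
    return {cid: weighted_sum[cid] / area_sum[cid] for cid in area_sum}


# Illustrative values: reduced (NHy) and oxidized (NOx) forms per cell,
# summed into bulk deposition as described for the ACCMIP case.
nhy = [4.0, 6.0, 2.0]
nox = [1.0, 3.0, 2.0]
bulk = [a + b for a, b in zip(nhy, nox)]

area = [100.0, 300.0, 50.0]
ids = ["A", "A", "B"]
result = national_deposition(bulk, area, ids)
print(result)
```

Because the weighted mean is linear, summing the forms per cell before aggregation gives the same national value as aggregating each form separately and then adding the results, provided the same cropland weights are used.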
Adjustments to cropland maps. HYDE cropland maps are available at decadal scale from 1960 to 2000, and at annual scale after 2000, until 2017 (Table 2). To create a continuous time series of N deposition, the following adjustments were made to the cropland map: […] forest and non-forest sub-types, pasture into managed pasture and rangeland, and cropland into multiple crop functional types. The fractions falling under the category of cropland (i.e., C3 annual crops, C3 perennial crops, C4 annual crops, C4 perennial crops, C3 nitrogen-fixing crops) were chosen, and the maximum fraction among these five sub-categories was selected to represent cropland. The fraction was converted to land area by multiplying the fraction of cropland in a grid cell by the grid cell's spatial resolution. LUH2 maps are available at annual scale until 2015. As a result, the cropland maps between 2015 and 2020 are assumed to be the same as in 2015.

Extrapolation and interpolation of N deposition maps.
where Y is the year, and β0 and β1 are the intercept and slope, respectively.
2. Interpolation between 1970 and 1997: Both the N emission and N deposition maps are available between 1970 and 1997, but for different sets of years. N emission maps are available annually from 1970 to 1997, while N deposition is only available for four years (i.e., 1970, 1980, 1990, and 1997). Using the years common to both datasets (i.e., 1970, 1980, 1990, and 1997), a relationship between N emission and N deposition was established (Eq. 2). This relationship was used to interpolate values of N deposition for years other than 1970, 1980, 1990, and 1997 using the N emission maps that are available annually. These steps were conducted at the gridded level, and the deposition rates were aggregated to national scale using the approach described in section "Methods for estimating N deposition".
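The interpolation step can be illustrated for a single grid cell as follows. This is only a sketch: the form of Eq. 2 is not shown in this text, so an ordinary least-squares linear relationship between emission and deposition is assumed here, and the function names (`fit_line`, `fill_deposition`) and all numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def fill_deposition(emission_by_year, deposition_by_year):
    """Estimate deposition for years that only have emission data.

    The relationship is fitted on the years present in both series
    (assumed linear here) and applied to the remaining emission years.
    Years with modelled deposition keep their original values.
    """
    common = sorted(set(emission_by_year) & set(deposition_by_year))
    b0, b1 = fit_line(
        [emission_by_year[y] for y in common],
        [deposition_by_year[y] for y in common],
    )
    filled = dict(deposition_by_year)
    for year, emission in emission_by_year.items():
        if year not in filled:
            filled[year] = b0 + b1 * emission
    return filled


# Made-up values for one grid cell; 1975 has emission data only.
emission = {1970: 10.0, 1975: 12.0, 1980: 14.0, 1990: 18.0, 1997: 20.0}
deposition = {1970: 5.0, 1980: 7.0, 1990: 9.0, 1997: 10.0}
filled = fill_deposition(emission, deposition)
print(filled[1975])
```

In a gridded setting the same fit would be repeated cell by cell before the national aggregation.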
Validation of the N deposition products. We used two approaches to validate the N deposition inputs from the four data products. First, we compared the N deposition data products in this study with the site-level observations. Second, we compared the four data products with the N deposition estimated by the other global N budget studies. For the first approach, we collected observation records of N deposition flux over agricultural lands across China, the UK, the USA, and East Asia, including 268, 4, 357, and 7 stations, respectively (Fig. 3 and Table 2). Except for the UK, the N deposition flux from the station networks is expressed in a form comparable to the products in this study. In the United Kingdom, six stations of the Environmental Change Network (ECN: https://ecn.ac.uk) are located on agricultural land (i.e., Drayton, Hillsborough, North Wyke, Porton, Rothamsted, and Wytham). Among them, only Drayton, North Wyke, Rothamsted, and Wytham have precipitation chemistry data available for ammonium, nitrate, and total nitrogen deposition (Fig. 3c). The N deposition data from each station are available on a daily scale in mg L −1 . To convert the unit to kg N ha −1 yr −1 , the following approach was followed: […] where n is the number of days with available data in a year from all stations. The daily precipitation data were obtained from ECMWF 38 . Each station's precipitation corresponds to the precipitation value in the grid cell adjacent to the respective station.
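The unit conversion for the UK stations can be illustrated with the following sketch. The study's own equation is not reproduced in this text, so this is an assumed form based only on dimensional reasoning: 1 mm of precipitation over 1 m² is 1 L, so a concentration in mg N L −1 multiplied by precipitation in mm gives mg N m −2 , and 1 mg m −2 equals 0.01 kg ha −1 . The function name and values are hypothetical, and the handling of days without data is an assumption.

```python
# 1 mg/m2 = 1e-6 kg / 1e-4 ha = 0.01 kg/ha
MG_PER_M2_TO_KG_PER_HA = 0.01


def annual_wet_deposition(conc_mg_per_l, precip_mm):
    """Sum daily wet N deposition (kg N/ha) over the days with data.

    conc_mg_per_l: daily N concentration in precipitation (mg N/L), None if missing
    precip_mm:     daily precipitation depth (mm), None if missing
    Days with a missing value are skipped (an assumption of this sketch).
    """
    total = 0.0
    for conc, precip in zip(conc_mg_per_l, precip_mm):
        if conc is None or precip is None:
            continue
        # mg/L * mm = mg/m2, then converted to kg/ha
        total += conc * precip * MG_PER_M2_TO_KG_PER_HA
    return total


# Three illustrative days; the second has no concentration measurement.
conc = [0.5, None, 1.2]
precip = [10.0, 3.0, 5.0]
total = annual_wet_deposition(conc, precip)
print(total)
```

A full-year series would be passed in the same way; any scaling for days without data would need to follow the study's original equation.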
For the second approach, we compared the N deposition estimates in this study with the N deposition from 14 global nitrogen budget datasets represented collectively in Zhang et al. 14 .
Sensitivity analysis. The sensitivity analysis examined how N deposition estimates from the four different products affect the assessment of NUE for each country over the past decades. The NUE is defined as the ratio of N removal by the crop (CR) to the sum of N inputs from synthetic fertilizer (SF), manure (MN), biological N fixation (BNF), and atmospheric N deposition (AD) (Eq. 3):
$$\mathrm{NUE} = \frac{\mathrm{CR}}{\mathrm{SF} + \mathrm{MN} + \mathrm{BNF} + \mathrm{AD}} \qquad (3)$$
We used N fertilizer, N manure, and N fixation from the FAOSTAT cropland nutrient budget 26 , while varying the N deposition estimates based on the four products developed in this study. Since the FAOSTAT cropland nutrient budget dataset 26 already includes N deposition from WL, we estimated a reference NUE with WL, and compared this NUE with the NUE estimated by replacing the N deposition with the other three products (i.e., AH, AL, and WH) while keeping the remaining three elements (i.e., SF, MN, and BNF) of the N budget the same.
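The sensitivity calculation can be sketched for one country and one year as follows. It applies the NUE definition given above (crop removal divided by the sum of the four inputs) while swapping only the deposition term. All numbers are made up for illustration (kg N ha −1 yr −1 ), and the variable names are hypothetical.

```python
def nue(cr, sf, mn, bnf, ad):
    """N use efficiency: crop N removal over total N inputs."""
    return cr / (sf + mn + bnf + ad)


# Fixed budget terms (illustrative values, kg N/ha/yr).
budget = {"CR": 50.0, "SF": 60.0, "MN": 20.0, "BNF": 10.0}

# Atmospheric deposition (AD) from the four products (illustrative values).
deposition = {"WL": 10.0, "AH": 6.0, "AL": 7.0, "WH": 9.5}

# NUE per product: only AD changes, the other inputs stay the same.
nue_by_product = {
    name: nue(budget["CR"], budget["SF"], budget["MN"], budget["BNF"], ad)
    for name, ad in deposition.items()
}

# Difference of each alternative product from the WL reference.
diff_from_wl = {
    name: value - nue_by_product["WL"]
    for name, value in nue_by_product.items()
    if name != "WL"
}
print(nue_by_product)
print(diff_from_wl)
```

With these made-up inputs, a lower deposition estimate shrinks the denominator and therefore raises NUE relative to the reference.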
With the four sets of NUE (i.e., NUE_WL, NUE_AH, NUE_AL, and NUE_WH), two parameters were estimated to evaluate the changes in NUE due to differences in N deposition.
Fig. 6 Comparison of the product AL (i.e., ACCMIP + LUH2) and other studies' total N deposition (Table 1)
The data contain N deposition (kg N ha −1 yr −1 ) in cropland. Each file has rows for countries and columns for years. NaN stands for "Not a Number".
[…] (Fig. 4). For example, the difference in global N deposition between ACCMIP and Wang et al. in the year 2000 was approximately 3 Tg N yr −1 , which increased over time, particularly after 1992. These variations are primarily caused by differences in the simulation and modeling approaches used to develop N deposition maps from emission inventories. Regardless of which deposition maps were used, the N deposition estimates with the HYDE map as weights were consistently lower than those using the LUH2 map. However, the differences between the two products derived from the Wang et al. deposition estimates (i.e., WH vs. WL) are small.
When comparing N deposition for a specific year, these differences on a global scale are also visible spatially. Figure 5a,b shows an example of estimates for the year 2000 using ACCMIP deposition data in combination with two cropland maps (i.e., HYDE and LUH2) on the national scale. The estimates for most countries appear to be the same, apart from Indonesia, where the estimate is higher with LUH2 than with HYDE. Similar to ACCMIP, N deposition estimates at country scale obtained from Wang et al. are shown in Fig. 5c […]
Evaluation of the N deposition products. Large variation exists in the national N deposition estimates from different products. Such uncertainty stems primarily from the various approaches and assumptions used to estimate N deposition. The AL product is at the lower end of the distribution of N deposition from 14 previously developed estimates, whereas WL is either around the median or on the higher side (Figs. 6 and 7). For most regions, WL is in the interquartile range of the 14 other studies. Europe's WL estimates are outside the interquartile range, on the lower end. However, with AL, the estimates for Europe are slightly higher and within the interquartile range. In the remaining regions, AL is almost always lower than the 14 other nitrogen deposition estimates. Overall, WL estimates compare favorably to the 14 N budget datasets with N deposition.
Comparison with station networks. All four data products either slightly overestimated or were within the range of station records of N deposition. For most practical purposes, these differences are probably acceptable.
In the USA, all four products overestimated N deposition from the station network (Fig. 8a). In fact, until 2002, the values fall outside the upper bounds of the station records. The four products appear to be within the range for the years 2003, 2011, 2013, and 2018. They are, however, above the 75th percentile. Hence, there is a clear indication of overestimation in the USA, but, for most application purposes, the products' N deposition estimates are only a few kg N per ha of cropland higher than the station records.
In China, products utilizing the ACCMIP-based deposition map were not only substantially lower than the Wang et al.-based products, but they were also at the lower end of the distribution of the station records. The products based on Wang et al. were comparable to station data estimates until 2008 but appear to be higher than the 75th percentile after 2008 (Fig. 8b).
In the UK, the N deposition products from both ACCMIP and Wang et al. fell within the range of the observed N deposition station records prior to 2009 and were approximately 5 kg N ha −1 higher than the median of the observed values after 2010 (Fig. 8c). In the observation data, a few cases have exceptionally high values of N deposition, which is due to the higher precipitation on some days in a year resulting in higher N deposition.
For countries in the EANET ground station network, data from only one or two stations at rural sites are available to evaluate the N deposition on agricultural crops 37 . In these countries, the N deposition products from Wang et al. are either lower than or similar to the ACCMIP estimates, except in Vietnam and Thailand (Fig. 9). This indicates underestimation of N deposition in some of the Southeast Asian countries by the Wang et al. products. Moreover, none of the estimates from either ACCMIP or Wang et al. are outside the range of N deposition from the sites. This indicates that the products developed in this study give reasonable values of N deposition.
Overall, the total N deposition (dry + wet) from the four products in China and the UK is close to the median wet deposition estimates from station networks (Table 3). However, we did not find a close match in the USA. The main reason for the systematically lower ground station values in the USA is that they do not include dry deposition. Although the values of our products in China fall close to the median, after 2010, the bulk N deposition estimates in China (Fig. 8b) from Wang et al.-based products rise above the median of the deposition record from the station network. This could be because of recent rising emissions in China and their direct correlation with higher deposition rates 55 . Hence, we think that if dry deposition were also accounted for at the sites, the deposition from the four products might match the site values. Furthermore, the UK's emissions are lower, and its rainfall rate and frequency higher, than those of the USA and China 27 . This is why the deposition values from our products match the wet deposition in the UK.
Sensitivity analysis. The differences in NUE are higher with the N deposition map from ACCMIP (Fig. 10). When using either the HYDE or the LUH2 cropland map as weights for aggregation of ACCMIP N deposition maps, the difference in NUE is close to zero for most countries except for a few countries in Africa and Eurasia. In contrast, the NUE difference between WH and WL products is in the range of 0 and 0. Similar to the differences in NUE, the correlation between NUE_WL and the NUE estimated by using the other three products did not vary much.
When the reference NUE (i.e., NUE_WL) for the period of 1961-2020, obtained from FAOSTAT's cropland nutrient budget dataset 26 , was correlated with the NUE estimated using the other three N deposition products (AH, AL, and WH), the correlation was high (r > 0.8) in the majority of countries (Section "Sensitivity analysis" and Fig. 11). This indicates that NUE is not very sensitive to the choice of N deposition product and that any of these products is generally usable for global scale NUE assessment.

Usage Notes
Within this study, we developed four N deposition products at national scale: AH: ACCMIP + HYDE, AL: ACCMIP + LUH2, WH: Wang et al. + HYDE, and WL: Wang et al. + LUH2. We recommend using the WL product for most general studies of global and regional cropland N budgets and NUE. It has already been adopted in the Cropland Nutrient Budget database 26 jointly released by FAO and IFA in 2022. The reasons for this recommendation are as follows: (1) the Wang et al. estimates are closer to the existing datasets for the majority of regions, particularly Asian countries, whereas the estimates obtained using ACCMIP are on the lower side of the distribution of N deposition compared to other datasets, with minimal change after 2005, (2) the LUH2 cropland map is available at an annual time scale, which serves the purpose of this study, while the HYDE map is only available at a decadal scale until 2000, and annually from 2000 to 2017, and (3) the WL product was within the range of ground estimates of N deposition from the station networks in multiple countries (i.e., China, the UK, the USA, and East Asia). Overall, however, the sensitivity analysis showed small impacts of the four N deposition products on estimating NUE at national to global scales.
Table 3. Comparison of measured and estimated annual N deposition (kg N ha −1 yr −1 ) for year 2000. Note: EANET is not compared here because only a single station value is available in each country.
The raw N deposition maps used in this study are derived from N emissions. Emissions may originate from a variety of sources, including transportation, wastewater handling, and direct soil emissions 27 . Because these emissions are not restricted by borders, our country-specific deposition estimates, which rely on the rate of emissions, may include overlap effects (i.e., N deposition in a country could include re-deposition of N emitted either within the country's boundary or in neighboring countries). Hence, the four products in this work do not map one-to-one onto national emission rates, which could be lower or higher.
Admittedly, the existing N deposition data products are still limited in the accuracy of their global scale estimates. Improving them will require better field-scale measurement techniques 9 and networks, modeling improvements 10 , and updated cropland maps 56 .